point cloud semantic segmentation
Reasoning Beyond Points: A Visual Introspective Approach for Few-Shot 3D Segmentation
Point Cloud Few-Shot Semantic Segmentation (PC-FSS) aims to segment unknown categories in query samples using only a small number of annotated support samples. However, scene complexity and insufficient representation of local geometric structures pose significant challenges to PC-FSS. To address these issues, we propose a novel pre-training-free Visual Introspective Prototype Segmentation network (VIP-Seg). Specifically, we design a Visual Introspective Prototype (VIP) module that employs a multi-step reasoning approach to tackle intra-class diversity and domain gaps between support and query sets. The VIP module consists of a Prototype Enhancement Module (PEM) and a Prototype Difference Module (PDM), which work alternately to progressively refine prototypes. The PEM enhances prototype discriminability and reduces intra-class diversity, while the PDM learns common representations from the differences between query and support features, effectively eliminating semantic inconsistencies caused by domain gaps. To further reduce intra-class diversity and enhance point discriminative ability, we propose a Dynamic Power Convolution (DyPowerConv) that leverages learnable power functions to effectively capture local geometric structures and detailed features of point clouds. Extensive experiments on S3DIS and ScanNet demonstrate that our proposed VIP-Seg significantly outperforms current state-of-the-art methods, proving its effectiveness in PC-FSS tasks.
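For orientation, the generic prototype-matching baseline that few-shot segmentation methods commonly build on can be sketched as follows. This is not VIP-Seg's PEM, PDM, or DyPowerConv, whose details the abstract does not give. It is a minimal illustration assuming per-point feature vectors are already available; the function names and toy features are hypothetical.

```python
import math

def masked_average_prototype(features, mask):
    """Average the support feature vectors whose mask entry is truthy."""
    selected = [f for f, m in zip(features, mask) if m]
    if not selected:
        raise ValueError("mask selects no points")
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)

def segment_query(query_features, prototypes):
    """Label each query point with the name of its most similar prototype."""
    return [
        max(prototypes, key=lambda name: cosine(q, prototypes[name]))
        for q in query_features
    ]

# Toy 2-D features (hypothetical): two foreground and two background support points.
support_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
prototypes = {
    "foreground": masked_average_prototype(support_features, [1, 1, 0, 0]),
    "background": masked_average_prototype(support_features, [0, 0, 1, 1]),
}
query_features = [[0.8, 0.2], [0.2, 0.7]]
predicted = segment_query(query_features, prototypes)
```

In this baseline, each prototype is computed once from the support set and never adapted to the query, which is the limitation that prototype-refinement approaches aim to address.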
Supplemental Material - Annotator: A Generic Active Learning Baseline for LiDAR Semantic Segmentation
The data was collected at Peking University and uses the same data format as SemanticKITTI. To ensure all tasks are well defined, we formalize a consistent and compatible semantic class vocabulary across the above datasets, ensuring a one-to-one mapping between all semantic classes. For the ASFDA and ADA settings, we have an additional warm-up stage, i.e., the network is ... Both source and target data have a batch size of 16. Both training loss and validation loss consistently decrease over time, indicating effective model training. We report mIoU results across existing AL approaches in Table A3.
Distribution Guidance Network for Weakly Supervised Point Cloud Semantic Segmentation
Our initial investigation identifies which distributions accurately characterize the feature space, subsequently leveraging this prior to guide the alignment of the weakly supervised embeddings. Specifically, we analyze the superiority of the mixture of von Mises-Fisher distributions (moVMF) among several common distribution candidates.
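To make the moVMF idea concrete, here is a small sketch of how an embedding could be softly assigned to mixture components. It is not the paper's network. It assumes equal mixing weights and a single shared concentration kappa, in which case the vMF normalizing constants cancel and the posterior reduces to a softmax over kappa times the cosine between the embedding and each mean direction. All names and values are hypothetical.

```python
import math

def l2_normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def vmf_responsibilities(embedding, mean_directions, kappa):
    """Posterior over components for an equal-weight, shared-kappa moVMF.

    Under these assumptions the per-component normalizers are identical and
    cancel, leaving softmax(kappa * <mu_k, x>) on unit vectors.
    """
    x = l2_normalize(embedding)
    logits = [
        kappa * sum(a * b for a, b in zip(x, l2_normalize(mu)))
        for mu in mean_directions
    ]
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - shift) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Two class mean directions in a toy 3-D feature space (hypothetical values).
means = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
resp = vmf_responsibilities([2.0, 0.5, 0.0], means, kappa=10.0)
```

With per-component concentrations or unequal weights the normalizers no longer cancel, and the full vMF density, which involves a modified Bessel function, would be needed.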
DAGLFNet: Deep Feature Attention Guided Global and Local Feature Fusion for Pseudo-Image Point Cloud Segmentation
Chen, Chuang, Lin, Yi, Wang, Bo, Hu, Jing, Wu, Xi, Ge, Wenyi
Environmental perception systems are crucial for high-precision mapping and autonomous navigation, with LiDAR serving as a core sensor providing accurate 3D point cloud data. Efficiently processing unstructured point clouds while extracting structured semantic information remains a significant challenge. In recent years, numerous pseudo-image-based representation methods have emerged to balance efficiency and performance by fusing 3D point clouds with 2D grids. However, the fundamental inconsistency between the pseudo-image representation and the original 3D information critically undermines 2D-3D feature fusion, posing a primary obstacle for coherent information fusion and leading to poor feature discriminability. This work proposes DAGLFNet, a pseudo-image-based semantic segmentation framework designed to extract discriminative features. It incorporates three key components: first, a Global-Local Feature Fusion Encoding (GL-FFE) module to enhance intra-set local feature correlation and capture global contextual information; second, a Multi-Branch Feature Extraction (MB-FE) network to capture richer neighborhood information and improve the discriminability of contour features; and third, a Feature Fusion via Deep Feature-guided Attention (FFDFA) mechanism to refine cross-channel feature fusion precision. Experimental evaluations demonstrate that DAGLFNet achieves mean Intersection-over-Union (mIoU) scores of 69.9% and 78.7% on the validation sets of SemanticKITTI and nuScenes, respectively. The method achieves an excellent balance between accuracy and efficiency.
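For reference, the mIoU metric quoted above can be computed as in the following sketch. It assumes integer class labels per point and skips classes that appear in neither predictions nor ground truth, which is one common convention; benchmark toolkits typically accumulate the counts over the whole validation set rather than per scan. The function name and toy labels are illustrative.

```python
def mean_iou(predictions, targets, num_classes):
    """Mean Intersection-over-Union over classes present in either input."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    ious = []
    for c in range(num_classes):
        tp = sum(1 for p, t in zip(predictions, targets) if p == c and t == c)
        fp = sum(1 for p, t in zip(predictions, targets) if p == c and t != c)
        fn = sum(1 for p, t in zip(predictions, targets) if p != c and t == c)
        union = tp + fp + fn
        if union == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / union)
    if not ious:
        raise ValueError("no class is present in the inputs")
    return sum(ious) / len(ious)

# Toy example with three classes and six points.
targets = [0, 0, 1, 1, 2, 2]
predictions = [0, 1, 1, 1, 2, 0]
score = mean_iou(predictions, targets, num_classes=3)
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.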